Bayesian Linear Regression

Summer Dahlen, Shijie Geng, Mary Morrow, Kayla Mota

4/15/24

Overview

  • Introduction
    • What is Bayesian Linear Regression?
    • How it works
    • Benefits of BLR
  • Methods
    • Regression Analysis Basics
    • BLR Basics
  • Analysis
    • Exploratory Data Analysis
    • BLR Analysis
  • Conclusion

Introduction

What is Bayesian Linear Regression?

Commonly abbreviated “BLR”, Bayesian linear regression is a data-analysis method, grounded in Bayes’ theorem, for modeling the conditional relationship between independent and dependent variables while handling uncertainty with accuracy and interpretability [@walker2007application].

How it works

  1. Specify prior distributions
  2. Input collected data
  3. Obtain posterior distribution (MCMC Method)
  4. Sensitivity analysis
  5. Skewness and heteroscedasticity check (SSSM)

  • The Markov Chain Monte Carlo (MCMC) method involves simulating random draws from the posterior distribution, allowing for exploration of the parameter space and estimation of the posterior distribution even in complex, high-dimensional models [@robert2018accelerating; @rojas2020lose].
  • Confirming the convergence of MCMC chains is important for the reliability of the results.
  • Evidence of convergence provides confidence in the validity of the samples [@marcotte2018gibbs].
  • After obtaining the posterior results, inferences can be used for the prior distribution of the current/next experiment [@rubio2016bayesian].
    • Posterior probabilities are assigned to the spectrum of values that parameters can assume.
    • The more prominent the peak, the more effective it is as an estimate because a single value has a higher probability compared to other values.
      • The credibility interval, indicating a 95% range of probable parameter estimates, is narrower.
      • Conversely, when the peak is less pronounced, other estimates are also plausible, resulting in a wider credibility interval for the 95% range of probable parameter estimates [@shetty2013evidence; @gill2002bayesian].
  • In BLR, even when utilizing representative or informed priors, it is crucial to perform a sensitivity analysis to assess the impact of different prior specifications on the posterior results.
  • This aims to ensure that the findings are not dependent on the choice of prior [@kruschke2021bayesian].
  • Once the experiment is conducted and the posterior distribution is obtained, the next step is to check the distributions for skewness and for heteroscedasticity.
  • Although scale mixtures of normal are often used in regression analysis to capture heteroscedasticity and outliers, skew-symmetric scale mixtures of normal (SSSM) can capture asymmetry, making it more robust to use in BLR [@rubio2016bayesian].
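The sampling steps above can be sketched with a minimal random-walk Metropolis sampler, one of the simplest members of the MCMC family (the analysis later in this document uses Stan's more sophisticated sampler via R). This is an illustrative sketch in Python on simulated data, not the project's actual code; it also shows how a 95% credible interval is read off the posterior draws.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: y = 2 + 3x + noise
n = 100
x = rng.normal(size=n)
y = 2.0 + 3.0 * x + rng.normal(scale=1.0, size=n)

def log_posterior(beta0, beta1, sigma=1.0):
    """Log posterior: Gaussian likelihood plus weak N(0, 10^2) priors."""
    resid = y - (beta0 + beta1 * x)
    log_lik = -0.5 * np.sum(resid**2) / sigma**2
    log_prior = -0.5 * (beta0**2 + beta1**2) / 10.0**2
    return log_lik + log_prior

# Random-walk Metropolis: propose a move, accept with prob min(1, ratio)
samples = []
theta = np.array([0.0, 0.0])
current_lp = log_posterior(*theta)
for _ in range(5000):
    proposal = theta + rng.normal(scale=0.1, size=2)
    lp = log_posterior(*proposal)
    if np.log(rng.uniform()) < lp - current_lp:
        theta, current_lp = proposal, lp
    samples.append(theta.copy())

draws = np.array(samples[1000:])                    # discard warm-up draws
ci_slope = np.percentile(draws[:, 1], [2.5, 97.5])  # 95% credible interval
print("posterior mean slope:", draws[:, 1].mean())
print("95% credible interval for slope:", ci_slope)
```

A sharply peaked posterior would show up here as a narrow interval around the mean; a flatter posterior widens it.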

Benefits of BLR

  • More flexible
    • Captures all uncertain random variables
    • Good for small sample sizes
    • Good for various data types
  • Eliminates need for p-values
    • Richer information on parameters
    • Allows for simultaneous single change point detection in a multivariate sample

Use cases of BLR

  • Construction
  • Transportation Engineering
  • Biomedicine

Methods

Regression Analysis Basics

Regression analysis is a statistical method that allows us to examine the relationship between two or more variables of interest. Suppose the response variable \(Y\) is a function of several predictor variables, \(x_{i1}, x_{i2},..., x_{ik}\).

To fit a model:

\[ Y = X \beta + \epsilon \quad(1) \] where:

  • \(Y = (y_1, y_2,..., y_n)\) is the n×1 vector of responses
  • \(X\) is the n×p design matrix, where \(p = k+1\)
  • \(\beta = (\beta_0, \beta_1,...., \beta_{k})\) is the (k+1)×1 vector of parameters
  • \(\epsilon = (\epsilon_1, \epsilon_2,...., \epsilon_n)\) is the n×1 vector of errors, with \(\epsilon \sim N_n(0, \sigma^2I)\)

The likelihood function of \(y\) given \(\beta\) and \(\sigma^2\) is defined as follows:

\[ f_y(y|x,\beta,\sigma^2) = (2\pi\sigma^2)^{-\frac{n}{2}} \exp\left(-\frac{1}{2\sigma^2}(y-x\beta)^T(y-x\beta)\right) \quad(2) \] which describes the probability of observing the data \(y\) given the predictors \(x\) and the parameters \(\beta\) and \(\sigma^2\).

To make an inference, the \(\beta\) parameters are determined through maximizing the likelihood function.
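Maximizing the likelihood in equation (2) over \(\beta\) is equivalent to ordinary least squares, giving the closed form \(\hat\beta = (X^TX)^{-1}X^Ty\). A quick numerical illustration (a Python sketch on simulated data; all names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated data with known coefficients beta = (1, 2)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # intercept + one predictor
beta_true = np.array([1.0, 2.0])
y = X @ beta_true + rng.normal(scale=0.5, size=n)

# Maximum likelihood estimate = least squares solution of the normal equations
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)  # close to [1, 2]
```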

BLR Basics

BLR is approached through probability distributions. Instead of estimating the response variable \(y\) as a single value, it is drawn from a probability distribution. When using BLR, the model assumes that the response follows a normal distribution, expressed as

\[ y \sim N(X\beta, \sigma^2I) \quad(3) \]

That is, \(y\) follows a multivariate normal distribution with mean \(X\beta\) and covariance \(\sigma^2I\).

Additionally, the parameters in the model are generated from a probability distribution. The posterior probability of these parameters is proportional to the prior and likelihood [@minka2000bayesian].

\[ P(\beta|y, X) = \frac{P(y|\beta, X) × P(\beta|X)}{P(y|X)} \quad(4) \]

In Bayesian statistics, when the prior distribution and the posterior distribution belong to the same family of probability distributions, they are called conjugate distributions. Using Bayes’ theorem, we have the joint posterior distribution as follows:

\[ f(\beta, \sigma^2|y,x) = \frac{f_y(y|x, \beta,\sigma^2)f(\beta|\sigma^2,m,V)f(\sigma^2|a,b)}{f_y(y)} \quad(5) \]

where:

  • \(f_y(y|x, \beta,\sigma^2)\) is the likelihood function,

  • \(f(\beta|\sigma^2,m,V)\) is the prior distribution for coefficients,

  • \(f(\sigma^2|a,b)\) is the prior distribution for the error variance, and

  • \(f_y(y)\) is the marginal likelihood.
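In the conjugate case with \(\sigma^2\) treated as known, the posterior for \(\beta\) in equation (5) has a closed form: it is normal with covariance \(V_{post} = (V^{-1} + X^TX/\sigma^2)^{-1}\) and mean \(m_{post} = V_{post}(V^{-1}m + X^Ty/\sigma^2)\). A minimal sketch of this update (in Python, on simulated data; the actual analysis below lets R's stan_glm() handle priors and sampling):

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated data: y = 1 + 2x + noise, with sigma assumed known
n, sigma = 150, 1.0
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(scale=sigma, size=n)

# Prior: beta ~ N(m, V)
m = np.zeros(2)
V = np.eye(2) * 10.0

# Conjugate update: posterior is N(m_post, V_post)
V_post = np.linalg.inv(np.linalg.inv(V) + X.T @ X / sigma**2)
m_post = V_post @ (np.linalg.inv(V) @ m + X.T @ y / sigma**2)
print("posterior mean:", m_post)  # close to the true (1, 2)
```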

A non-conjugate prior refers to a prior distribution that does not belong to the same family of distributions as the posterior distribution after observing data. In such cases, the posterior distribution can be expressed as

\[ f(\beta|y, X, \sigma^2) ∝ f_y(y|X, \beta)\prod_{i=1}^{n}f(\beta_i|m_i,v_i) \quad(6) \] where:

  • \(f(\beta|y, X, \sigma^2)\) is the posterior distribution of coefficients,

  • \(f_y(y|X, \beta)\) is the likelihood function, and

  • \(f(\beta_i|m_i,v_i)\) is the prior distribution of each coefficient.

Analysis

Required packages

library(tidyverse)
library(gtsummary)
library(skimr)
library(randomForest)
library(GGally)
library(corrplot)
library(car)
library(ggrepel)
library(rstan)
library(rstanarm)
library(bayesplot)

Dataset Introduction

The selected Life Expectancy dataset comprises 21 health, economic, and social factors from 179 countries between 2000 and 2015.

# Read the CSV file from a local path

url = "C:/Users/shiji/OneDrive/桌面/STA6257_capstone/Life-Expectancy-Data-Updated.csv"
ledata = read.csv(url, header = TRUE)

names(ledata) <- c("Country", "Region", "Year", "InfDth", "UnFiveDths", "AdultMort", "Alcohol", "HepB", "Measles", "BMI", "Polio", "Diphtheria", "HIV", "GDP", "population", "ThinAd", "ThinChild", "Schooling", "Developing", "Developed", "LifeExp")
str(ledata)
'data.frame':   2864 obs. of  21 variables:
 $ Country   : chr  "Turkiye" "Spain" "India" "Guyana" ...
 $ Region    : chr  "Middle East" "European Union" "Asia" "South America" ...
 $ Year      : int  2015 2015 2007 2006 2012 2006 2015 2000 2001 2008 ...
 $ InfDth    : num  11.1 2.7 51.5 32.8 3.4 9.8 6.6 8.7 22 15.3 ...
 $ UnFiveDths: num  13 3.3 67.9 40.5 4.3 11.2 8.2 10.1 26.1 17.8 ...
 $ AdultMort : num  105.8 57.9 201.1 222.2 58 ...
 $ Alcohol   : num  1.32 10.35 1.57 5.68 2.89 ...
 $ HepB      : int  97 97 60 93 97 88 97 88 97 97 ...
 $ Measles   : int  65 94 35 74 89 86 97 99 87 92 ...
 $ BMI       : num  27.8 26 21.2 25.3 27 26.4 26.2 25.9 27.9 26.5 ...
 $ Polio     : int  97 97 67 92 94 89 97 99 97 96 ...
 $ Diphtheria: int  97 97 64 93 94 89 97 99 99 90 ...
 $ HIV       : num  0.08 0.09 0.13 0.79 0.08 0.16 0.08 0.08 0.13 0.43 ...
 $ GDP       : int  11006 25742 1076 4146 33995 9110 9313 8971 3708 2235 ...
 $ population: num  78.53 46.44 1183.21 0.75 7.91 ...
 $ ThinAd    : num  4.9 0.6 27.1 5.7 1.2 2 2.3 2.3 4 2.9 ...
 $ ThinChild : num  4.8 0.5 28 5.5 1.1 1.9 2.3 2.3 3.9 3.1 ...
 $ Schooling : num  7.8 9.7 5 7.9 12.8 7.9 12 10.2 9.6 10.9 ...
 $ Developing: int  0 1 0 0 1 0 0 1 0 0 ...
 $ Developed : int  1 0 1 1 0 1 1 0 1 1 ...
 $ LifeExp   : num  76.5 82.8 65.4 67 81.7 78.2 71.2 71.2 71.9 68.7 ...

Exploratory Data Analysis

Let’s look at the distribution of life expectancy and how life expectancy differs across regions over the years.

# Distribution of Life Expectancy (years)

hist(ledata$LifeExp, 
     main = "Distribution of Life Expectancy (years)",
     xlab = "Life Expectancy",
     border = "white", 
     col = "lightblue",
     ylim = c(0, 800))

# Create a line plot of mean life expectancy in all regions over the years
life_expectancy_summary <- ledata %>%
  group_by(Year, Region) %>%
  summarise(
    mean_life_expectancy = mean(LifeExp) )
ggplot(life_expectancy_summary, aes(x = Year)) +
  geom_line(aes(y = mean_life_expectancy, color = Region), linetype = "solid") +
  labs(title = "Regional Life Expectancy by Year (years)", x = "Year", y = "Life Expectancy", color = "Region") +
  theme_minimal() +
  scale_color_brewer(palette = "Set1") 

# Life Expectancy by Economy Status and Region
ggplot(ledata, aes(x = factor(Developing), y = LifeExp, fill = factor(Developed))) +
  geom_bar(stat = "summary", fun = "mean", position = "dodge", color = "black", width = 0.7) +
  labs(title = "Life Expectancy by Economy Status and Region", x = "Economy Status", y = "Mean Life Expectancy") +
  scale_fill_manual(values = c("skyblue", "salmon"), labels = c("Developing", "Developed")) +
  theme_minimal() +
  theme(legend.title = element_blank()) +
  facet_wrap(~ Region)

What is the impact of schooling on the lifespan of humans?

How do infant and adult mortality rates affect life expectancy?

Let’s continue visualizing data. :)

Does life expectancy have a positive or negative correlation with diphtheria immunization coverage, GDP, Body Mass Index, etc.?

Finally, let’s see how strongly these predictors are related to lifespan and check for intercorrelations (multicollinearity) among two or more predictors.

    InfDth UnFiveDths  AdultMort    Alcohol       HepB    Measles        BMI 
 44.329144  44.469991   7.618888   1.939630   2.560952   1.587422   2.627136 
     Polio Diphtheria        HIV        GDP population     ThinAd  ThinChild 
 11.990081  12.940367   2.890280   2.016715   1.150097   8.949993   8.936053 
 Schooling 
  4.273524 

Bayesian Linear Regression Analysis

  • Create a BLR model

  • Evaluate the performance of BLR model

  • Compare with Frequentist method

Create a BLR model

Our modeling dataset comprises 2,864 observations, consisting of five predictors (GDP, BMI, Alcohol, Schooling, and HIV) and one response variable (LifeExp).

In this analysis, we employ the stan_glm() function from the rstanarm package to fit a Bayesian Linear Regression model.


SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
Chain 1: 
Chain 1: Gradient evaluation took 0.000139 seconds
Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 1.39 seconds.
Chain 1: Adjust your expectations accordingly!
Chain 1: 
Chain 1: 
Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
Chain 1: Iteration:  600 / 2000 [ 30%]  (Warmup)
Chain 1: Iteration:  800 / 2000 [ 40%]  (Warmup)
Chain 1: Iteration: 1000 / 2000 [ 50%]  (Warmup)
Chain 1: Iteration: 1001 / 2000 [ 50%]  (Sampling)
Chain 1: Iteration: 1200 / 2000 [ 60%]  (Sampling)
Chain 1: Iteration: 1400 / 2000 [ 70%]  (Sampling)
Chain 1: Iteration: 1600 / 2000 [ 80%]  (Sampling)
Chain 1: Iteration: 1800 / 2000 [ 90%]  (Sampling)
Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
Chain 1: 
Chain 1:  Elapsed Time: 0.18 seconds (Warm-up)
Chain 1:                0.299 seconds (Sampling)
Chain 1:                0.479 seconds (Total)
Chain 1: 

SAMPLING FOR MODEL 'continuous' NOW (CHAIN 2).
Chain 2: 
Chain 2: Gradient evaluation took 2.8e-05 seconds
Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.28 seconds.
Chain 2: Adjust your expectations accordingly!
Chain 2: 
Chain 2: 
Chain 2: Iteration:    1 / 2000 [  0%]  (Warmup)
Chain 2: Iteration:  200 / 2000 [ 10%]  (Warmup)
Chain 2: Iteration:  400 / 2000 [ 20%]  (Warmup)
Chain 2: Iteration:  600 / 2000 [ 30%]  (Warmup)
Chain 2: Iteration:  800 / 2000 [ 40%]  (Warmup)
Chain 2: Iteration: 1000 / 2000 [ 50%]  (Warmup)
Chain 2: Iteration: 1001 / 2000 [ 50%]  (Sampling)
Chain 2: Iteration: 1200 / 2000 [ 60%]  (Sampling)
Chain 2: Iteration: 1400 / 2000 [ 70%]  (Sampling)
Chain 2: Iteration: 1600 / 2000 [ 80%]  (Sampling)
Chain 2: Iteration: 1800 / 2000 [ 90%]  (Sampling)
Chain 2: Iteration: 2000 / 2000 [100%]  (Sampling)
Chain 2: 
Chain 2:  Elapsed Time: 0.12 seconds (Warm-up)
Chain 2:                0.296 seconds (Sampling)
Chain 2:                0.416 seconds (Total)
Chain 2: 

SAMPLING FOR MODEL 'continuous' NOW (CHAIN 3).
Chain 3: 
Chain 3: Gradient evaluation took 2.5e-05 seconds
Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.25 seconds.
Chain 3: Adjust your expectations accordingly!
Chain 3: 
Chain 3: 
Chain 3: Iteration:    1 / 2000 [  0%]  (Warmup)
Chain 3: Iteration:  200 / 2000 [ 10%]  (Warmup)
Chain 3: Iteration:  400 / 2000 [ 20%]  (Warmup)
Chain 3: Iteration:  600 / 2000 [ 30%]  (Warmup)
Chain 3: Iteration:  800 / 2000 [ 40%]  (Warmup)
Chain 3: Iteration: 1000 / 2000 [ 50%]  (Warmup)
Chain 3: Iteration: 1001 / 2000 [ 50%]  (Sampling)
Chain 3: Iteration: 1200 / 2000 [ 60%]  (Sampling)
Chain 3: Iteration: 1400 / 2000 [ 70%]  (Sampling)
Chain 3: Iteration: 1600 / 2000 [ 80%]  (Sampling)
Chain 3: Iteration: 1800 / 2000 [ 90%]  (Sampling)
Chain 3: Iteration: 2000 / 2000 [100%]  (Sampling)
Chain 3: 
Chain 3:  Elapsed Time: 0.183 seconds (Warm-up)
Chain 3:                0.296 seconds (Sampling)
Chain 3:                0.479 seconds (Total)
Chain 3: 

SAMPLING FOR MODEL 'continuous' NOW (CHAIN 4).
Chain 4: 
Chain 4: Gradient evaluation took 3.2e-05 seconds
Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.32 seconds.
Chain 4: Adjust your expectations accordingly!
Chain 4: 
Chain 4: 
Chain 4: Iteration:    1 / 2000 [  0%]  (Warmup)
Chain 4: Iteration:  200 / 2000 [ 10%]  (Warmup)
Chain 4: Iteration:  400 / 2000 [ 20%]  (Warmup)
Chain 4: Iteration:  600 / 2000 [ 30%]  (Warmup)
Chain 4: Iteration:  800 / 2000 [ 40%]  (Warmup)
Chain 4: Iteration: 1000 / 2000 [ 50%]  (Warmup)
Chain 4: Iteration: 1001 / 2000 [ 50%]  (Sampling)
Chain 4: Iteration: 1200 / 2000 [ 60%]  (Sampling)
Chain 4: Iteration: 1400 / 2000 [ 70%]  (Sampling)
Chain 4: Iteration: 1600 / 2000 [ 80%]  (Sampling)
Chain 4: Iteration: 1800 / 2000 [ 90%]  (Sampling)
Chain 4: Iteration: 2000 / 2000 [100%]  (Sampling)
Chain 4: 
Chain 4:  Elapsed Time: 0.199 seconds (Warm-up)
Chain 4:                0.297 seconds (Sampling)
Chain 4:                0.496 seconds (Total)
Chain 4: 
stan_glm
 family:       gaussian [identity]
 formula:      LifeExp ~ .
 observations: 2864
 predictors:   6
------
            Median MAD_SD
(Intercept) 37.5    1.1  
GDP          0.0    0.0  
BMI          0.9    0.1  
Alcohol      0.0    0.0  
Schooling    1.2    0.0  
HIV         -1.6    0.0  

Auxiliary parameter(s):
      Median MAD_SD
sigma 4.6    0.1   

------
* For help interpreting the printed output see ?print.stanreg
* For info on the priors used see ?prior_summary.stanreg

Upon reviewing the summary, we discern the parameter estimates for the model. Keeping only the coefficients that are nonzero at the displayed precision (GDP and Alcohol round to 0.0), the model’s equation can be represented as follows:

\[ LifeExp = 37.5 + 0.9×BMI + 1.2×Schooling - 1.6×HIV \]
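As a quick sanity check on these point estimates, we can plug in illustrative values for a hypothetical country (BMI 26, 10 years of schooling, HIV incidence 0.1); Python is used here just for the arithmetic:

```python
# Posterior-median coefficients taken from the model summary above
intercept, b_bmi, b_school, b_hiv = 37.5, 0.9, 1.2, -1.6

# Hypothetical inputs: BMI = 26, Schooling = 10, HIV = 0.1
life_exp = intercept + b_bmi * 26 + b_school * 10 + b_hiv * 0.1
print(round(life_exp, 2))  # 72.74
```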

Evaluate the BLR Model

The stan_trace() function was used to visualize the traces of parameters estimated by Markov chain Monte Carlo (MCMC) sampling.

The trace plot for each variable appears centered around one value, and all chains overlap well around that value.


Model Info:
 function:     stan_glm
 family:       gaussian [identity]
 formula:      LifeExp ~ .
 algorithm:    sampling
 sample:       4000 (posterior sample size)
 priors:       see help('prior_summary')
 observations: 2864
 predictors:   6

Estimates:
              mean   sd   10%   50%   90%
(Intercept) 37.5    1.1 36.1  37.5  39.0 
GDP          0.0    0.0  0.0   0.0   0.0 
BMI          0.9    0.1  0.8   0.9   1.0 
Alcohol      0.0    0.0 -0.1   0.0   0.0 
Schooling    1.2    0.0  1.1   1.2   1.3 
HIV         -1.6    0.0 -1.6  -1.6  -1.5 
sigma        4.6    0.1  4.5   4.6   4.7 

Fit Diagnostics:
           mean   sd   10%   50%   90%
mean_PPD 68.9    0.1 68.7  68.9  69.0 

The mean_ppd is the sample average posterior predictive distribution of the outcome variable (for details see help('summary.stanreg')).

MCMC diagnostics
              mcse Rhat n_eff
(Intercept)   0.0  1.0  3758 
GDP           0.0  1.0  3832 
BMI           0.0  1.0  3561 
Alcohol       0.0  1.0  3902 
Schooling     0.0  1.0  3069 
HIV           0.0  1.0  4809 
sigma         0.0  1.0  3563 
mean_PPD      0.0  1.0  3787 
log-posterior 0.0  1.0  1849 

For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).

\(\hat{R}\) compares the variability in sampled parameter estimates across all chains combined with the variability within each individual chain.

\(\hat{R} = 1\) indicates stability across chains.
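A minimal version of the (non-split) \(\hat{R}\) computation can be sketched as follows, assuming `chains` holds the posterior draws for one parameter as an array of shape (n_chains, n_draws); rstanarm reports the split-chain variant automatically. Illustrative Python on synthetic draws:

```python
import numpy as np

def rhat(chains):
    """Basic potential scale reduction factor for one parameter.

    chains: array of shape (n_chains, n_draws).
    """
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    B = n * chain_means.var(ddof=1)          # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()    # within-chain variance
    var_plus = (n - 1) / n * W + B / n       # pooled variance estimate
    return np.sqrt(var_plus / W)

# Well-mixed chains give rhat near 1; chains stuck in different
# regions inflate the between-chain variance and push rhat above 1.
rng = np.random.default_rng(3)
good = rng.normal(size=(4, 1000))
bad = good + np.arange(4)[:, None]           # each chain centered elsewhere
print(rhat(good))   # close to 1
print(rhat(bad))    # well above 1
```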

Evaluate the BLR Model

The dark blue line shows the observed data while the light blue lines are simulations from the posterior predictive distribution.

Although the peaks of both the observed data distribution and the posterior distribution align along the x-axis, there’s a notable discrepancy in the density values along the y-axis.

Evaluate the BLR Model

The histograms exhibit the distribution of posterior samples obtained from the model.

To calculate the mean and median values of each parameter, we extracted posterior samples from the model and converted them to a data frame.

Mean calculations

(Intercept)     Alcohol         GDP         BMI   Schooling         HIV 
   37.54522    -0.04726     0.00012     0.89498     1.19895    -1.58586 
      sigma 
    4.58714 

Median calculations

(Intercept)     Alcohol         GDP         BMI   Schooling         HIV 
   37.52769    -0.04706     0.00012     0.89646     1.19960    -1.58576 
      sigma 
    4.58648 

We analyzed how the observed data has changed the parameter estimates.

Priors for model 'glm_post1' 
------
Intercept (after predictors centered)
  Specified prior:
    ~ normal(location = 69, scale = 2.5)
  Adjusted prior:
    ~ normal(location = 69, scale = 24)

Coefficients
  Specified prior:
    ~ normal(location = [0,0,0,...], scale = [2.5,2.5,2.5,...])
  Adjusted prior:
    ~ normal(location = [0,0,0,...], scale = [ 0.0014,10.7179, 5.9052,...])

Auxiliary (sigma)
  Specified prior:
    ~ exponential(rate = 1)
  Adjusted prior:
    ~ exponential(rate = 0.11)
------
See help('prior_summary.stanreg') for more details

Compare with Frequentist Method

Resulting Model: \(LifeExp = 37.5 + 0.9×BMI + 1.2×Schooling - 1.6×HIV\).


Call:
glm(formula = LifeExp ~ ., family = gaussian, data = BLR_data)

Deviance Residuals: 
     Min        1Q    Median        3Q       Max  
-17.9434   -3.2096    0.1765    3.5416   16.8603  

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  3.752e+01  1.137e+00  33.006   <2e-16 ***
GDP          1.216e-04  6.297e-06  19.319   <2e-16 ***
BMI          8.964e-01  5.143e-02  17.430   <2e-16 ***
Alcohol     -4.680e-02  2.818e-02  -1.661   0.0968 .  
Schooling    1.198e+00  4.699e-02  25.488   <2e-16 ***
HIV         -1.586e+00  3.713e-02 -42.725   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for gaussian family taken to be 21.01963)

    Null deviance: 253277  on 2863  degrees of freedom
Residual deviance:  60074  on 2858  degrees of freedom
AIC: 16858

Number of Fisher Scoring iterations: 2

The maximum likelihood estimates of the intercept and slopes are nearly identical to the mean values of posterior samples obtained from Bayesian Linear Regression.

Bayesian Linear Regression

(Intercept)     Alcohol         GDP         BMI   Schooling         HIV 
   37.54522    -0.04726     0.00012     0.89498     1.19895    -1.58586 
      sigma 
    4.58714 

Frequentist method

(Intercept)         GDP         BMI     Alcohol   Schooling         HIV 
   37.51720     0.00012     0.89637    -0.04680     1.19761    -1.58617 

Conclusion

Notable Observations

In this project, we successfully explored the effects of the predictors on life expectancy and the relationships among them. There were some notable observations while conducting our analysis.

  • There was a significant negative relationship between adult mortality, infant deaths, and under-five deaths and life expectancy. As the death rates of adults, infants, and children under five go up, life expectancy tends to shorten.

  • The rates of thinness among adolescents and children are also critical factors affecting life expectancy, with a negative relationship between thinness and life expectancy.

  • Life expectancy varies with the disparity in years of schooling across regions. Generally, life expectancy has tended to rise since 2000 with increased schooling and higher levels of economic development.

  • Due to the presence of correlation and multicollinearity, we selected only six variables as predictors for conducting Bayesian Linear Regression Analysis.

Bayesian Linear Regression Analysis Conclusion

  • Our model was successful in predicting life expectancy from the variables GDP, BMI, Alcohol, Schooling, and HIV.

  • The four chains converge strongly, each centered around a single value, indicating the excellent performance of the model.

  • In the BLR model, the variables BMI, Schooling, and HIV carry the most weight in predicting life expectancy.

Questions?